Four-layer categorization scheme of fast GMM computation techniques in large vocabulary continuous speech recognition systems

Authors

  • Arthur Chan
  • Mosur Ravishankar
  • Alexander I. Rudnicky
  • Jahanzeb Sherwani
Abstract

Large vocabulary continuous speech recognition systems are known to be computationally intensive. A major bottleneck is the Gaussian mixture model (GMM) computation, and various techniques have been proposed to address this problem. We present a systematic study of fast GMM computation techniques. As there are a large number of these techniques and it is impractical to evaluate all of them exhaustively, we first categorized them into four layers and selected representative techniques to evaluate in each layer. Based on this framework, we provide a detailed analysis and comparison of GMM computation techniques from the four-layer perspective and explore two subtle practical issues: 1) how different techniques can be combined effectively, and 2) how beam pruning affects the performance of GMM computation techniques. All techniques are evaluated in the CMU Communicator domain, and we also compare their performance with results reported in the literature.
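The bottleneck referred to above is the per-frame evaluation of every active tied state's (senone's) Gaussian mixture against the incoming feature vector. The following minimal sketch, with illustrative names and shapes that are not taken from the paper, shows this baseline computation for diagonal-covariance Gaussians in the log domain; the four layers mentioned in the abstract correspond roughly to the frame, GMM, Gaussian, and component levels at which this loop can be shortcut, and beam pruning interacts with it because the decoder only requests scores for the senones that survive the beam.

    # Illustrative sketch (not from the paper): baseline per-frame GMM scoring
    # for HMM-based ASR with diagonal-covariance Gaussians in the log domain.
    # Fast-computation techniques shortcut this work at different levels
    # (frame, GMM, Gaussian, component).
    import numpy as np

    def gmm_log_likelihood(x, log_weights, means, inv_vars, log_norms):
        """Log-likelihood of one feature vector x under one GMM (one senone).

        means, inv_vars: (n_mix, dim) arrays; log_norms: (n_mix,) precomputed
        Gaussian normalization terms; log_weights: (n_mix,) log mixture weights.
        """
        diff = x - means                                    # (n_mix, dim)
        log_gauss = log_norms - 0.5 * np.sum(diff * diff * inv_vars, axis=1)
        # Exact mixture score; many decoders use the cheaper best-mixture
        # (max) approximation instead of the full log-sum-exp.
        return np.logaddexp.reduce(log_weights + log_gauss)

    def score_frame(x, active_senones, model):
        """Score only the senones requested by the search; beam pruning in the
        decoder already limits which GMMs are asked for in each frame."""
        return {s: gmm_log_likelihood(x, *model[s]) for s in active_senones}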


Similar articles

Efficient codebook for fast and accurate low resource ASR systems

Nowadays, speech interfaces have become widely employed in mobile devices, thus recognition speed and power consumption are becoming new metrics of Automatic Speech Recognition (ASR) performance. For ASR systems using continuous Hidden Markov Models (HMMs), the computation of the state likelihood is one of the most time-consuming parts. Hence, we propose in this paper novel multi-level Gaussian...


A 40-nm 168-mW 2.4×-real-time VLSI processor for 60-kWord continuous speech recognition

This paper describes a low-power VLSI chip for speaker-independent 60-kWord continuous speech recognition based on a context-dependent Hidden Markov Model (HMM). Our implementation includes a compression–decoding scheme to reduce the external memory bandwidth for Gaussian Mixture Model (GMM) computation and multi-path Viterbi transition units. We optimize the internal SRAM size using the max-ap...
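The compression–decoding idea can be illustrated with a generic scalar-quantization sketch (an assumption for illustration, not the chip's actual scheme): Gaussian means and variances are stored as 8-bit codes with a per-dimension scale and offset, which cuts the parameter bandwidth roughly fourfold relative to 32-bit floats at the cost of a cheap dequantization step before scoring.

    # Illustrative sketch (generic, not the chip's actual scheme): per-dimension
    # linear quantization of Gaussian parameters to 8-bit codes to reduce the
    # memory traffic needed to stream model parameters during GMM scoring.
    import numpy as np

    def quantize_params(values, n_bits=8):
        """Linearly quantize each dimension (column) of a (n_mix, dim) array."""
        lo = values.min(axis=0)
        hi = values.max(axis=0)
        scale = (hi - lo) / (2 ** n_bits - 1)
        scale[scale == 0] = 1.0              # avoid divide-by-zero on flat dims
        codes = np.round((values - lo) / scale).astype(np.uint8)
        return codes, scale, lo

    def dequantize_params(codes, scale, lo):
        """Recover approximate parameter values before likelihood evaluation."""
        return codes.astype(np.float32) * scale + lo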


Fast likelihood computation methods for continuous mixture densities in large vocabulary speech recognition

This paper studies algorithms for reducing the computational effort of the mixture density calculations in HMM-based speech recognition systems. These likelihood calculations take about 70–85% of the total recognition time in the RWTH system for large vocabulary continuous speech recognition. To reduce the computational cost of the likelihood calculations, we investigate several space partitioni...
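One common family of space-partitioning methods is vector-quantization-based Gaussian selection: a small codebook partitions the acoustic space, each codeword keeps a shortlist of mixture components considered close to that region, and only the shortlisted components are evaluated exactly while the rest receive a floor score. The sketch below is a generic illustration under these assumptions (the function names and the Euclidean shortlist criterion are hypothetical), not the specific algorithm investigated in the cited work.

    # Illustrative sketch of VQ-based Gaussian selection (generic): a small
    # codebook partitions the acoustic space, each codeword stores a shortlist
    # of mixture components to evaluate exactly, and everything else is floored.
    import numpy as np

    def build_shortlists(codebook, means, max_dist):
        """For each codeword, keep indices of Gaussians whose mean lies within
        max_dist (Euclidean) of the codeword."""
        shortlists = []
        for c in codebook:                                  # (n_code, dim)
            d = np.linalg.norm(means - c, axis=1)           # (n_mix,)
            shortlists.append(np.nonzero(d <= max_dist)[0])
        return shortlists

    def selected_log_likelihood(x, codebook, shortlists, log_weights,
                                means, inv_vars, log_norms, floor=-1e4):
        code = np.argmin(np.linalg.norm(codebook - x, axis=1))  # quantize frame
        idx = shortlists[code]
        if idx.size == 0:
            return floor                                    # back off if empty
        diff = x - means[idx]
        log_gauss = log_norms[idx] - 0.5 * np.sum(diff * diff * inv_vars[idx],
                                                  axis=1)
        return np.logaddexp.reduce(log_weights[idx] + log_gauss)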


Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...


Efficient codebooks for fast and accurate low resource ASR systems

Today, speech interfaces have become widely employed in mobile devices, thus recognition speed and resource consumption are becoming new metrics of Automatic Speech Recognition (ASR) performance. For ASR systems using continuous Hidden Markov Models (HMMs), the computation of the state likelihood is one of the most time-consuming parts. In this paper, we propose novel multi-level Gaussian selec...


Publication year: 2004